Entry Name: UNCC-Tayeby-MC1

VAST Challenge 2015
Mini-Challenge 1

Team Members:

Omar ElTayeby, University of North Carolina, oeltayeb@uncc.edu PRIMARY

Wenwen Dou, University of North Carolina, wdou1@uncc.edu

Isaac Cho, University of North Carolina, choissac@gmail.com

 

Student Team: Yes

 

Did you use data from both mini-challenges? No

 

Analytic Tools Used:

D3, http://d3js.org/

 

Approximately how many hours were spent working on this submission in total?

4 weeks

 

Video:

MC1 video.

 

 

Questions

MC1.1 – Characterize the attendance at DinoFun World on this weekend. Describe up to twelve different types of groups at the park on this weekend. 

a.       How big is this type of group?

b.      Where does this type of group like to go in the park?

c.       How common is this type of group?

d.      What are your other observations about this type of group?

e.      What can you infer about this type of group?

f.        If you were to make one improvement to the park to better meet this group’s needs, what would it be?

Limit your response to no more than 12 images and 1000 words.

 

Heat map mode:

This mode contains two main views: the map and the timeline (number of movements vs. time). The map shows the concentration of movements in the park over the three days, and the user can select specific hours. Greenish dots mark the areas of high concentration and yellowish dots mark the areas of lower concentration (image 0.1).

 

Path view mode:

The path visualization on the map shows check-ins as black dots and the path of an individual visitor as a blue line. Black lines show the paths during the selected hours. The cluster visualization (image 0.2) is the result of applying the t-SNE dimensionality reduction algorithm (by Laurens van der Maaten, http://lvdmaaten.github.io/tsne/) to the similarity matrix obtained from the Longest Common Subsequence (LCS) between each pair of visitors: we organized the check-ins of each visitor as a sequence and applied LCS to calculate the similarity of the sequences. The number of movements vs. time visualization is used to select hours across the three days.
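As a rough illustration of the LCS step described above, the following Python sketch computes a pairwise similarity matrix from check-in sequences. It is not the code used for this submission. The normalization by the longer sequence length, the function names, and the toy sequences are all assumptions made for the example; only attraction ID 63 (the Grinosaurus stage) comes from the text.

```python
from itertools import combinations


def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a, b):
    """LCS length divided by the longer sequence length (an assumed normalization)."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def similarity_matrix(sequences):
    """Build a symmetric similarity matrix from {visitor_id: [attraction ids]}."""
    ids = sorted(sequences)
    n = len(ids)
    # The diagonal is set to 1.0 by convention: every visitor matches itself.
    matrix = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in combinations(range(n), 2):
        s = lcs_similarity(sequences[ids[i]], sequences[ids[j]])
        matrix[i][j] = matrix[j][i] = s
    return ids, matrix


# Hypothetical check-in sequences, invented for illustration only.
toy = {"A": [63, 5, 8, 5], "B": [5, 8, 63], "C": [7]}
ids, matrix = similarity_matrix(toy)
```

A matrix of this kind (or one minus it, used as a distance) is what a t-SNE implementation would take as input to lay the visitors out in two dimensions.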

 

heat_mode.png

Image 0.1

 

path_mode.png

Image 0.2

 

From the path mode visualization:

a. & b.

There are four main groups. In the first group, the visitors just walked around and did not check in to many attractions (e.g., image 1.1). In the clustering view, these visitors tend to lie away from the cluster centers. In the second group, the visitors check in mostly to the attractions and rides in the south of the park (Coaster alley and Wet land) (e.g., image 1.2). These visitors are more common in the green cluster on Friday. In the third group, the visitors spend most of their time in the Kiddie and Tundra lands (e.g., image 1.3). In the fourth group, the visitors checked in to many places all over the park (e.g., image 1.4).

 

c., d. & e.

The clustering view colors the visitors into three groups. On Friday, clicking on an individual visitor in the red group shows that visitor entering from the North entrance, while visitors in the blue group enter from the West entrance and visitors in the green group enter from the East entrance.

 

f. If we were to improve the park, we would move the Tar pit stop away from the Grinosaurus stage, since the stretch from the green area in the Wet land to the Tar pit stop gets crowded at peak hours, as shown in image 1.5. A second improvement would be to extend the pavement on the green area of the Wet land, so that visitors who stop there leave room for others to pass.

 

group1.png

Image 1.1

 

group2.png

Image 1.2

 

group3.png

Image 1.3

group4.png

Image 1.4

 

sun_2pm.png

Image 1.5

 

MC1.2 – Are there notable differences in the patterns of activity in the park across the three days? Please describe the notable difference you see.

 

Limit your response to no more than 3 images and 300 words.

 

The 3 images 2.1, 2.2 and 2.3 show the number of movements throughout each day during the weekend.

The notable differences are as follows:

1.       The total number of visitors was 3557 on Friday, 6411 on Saturday, and 7569 on Sunday.

2.       The volume of movements increased over the three days, with Sunday having the largest number of movements throughout the day.

3.       Each day has two peaks in the number of movements. On Friday and Saturday they fall between 11 am and 12 pm and between 4 pm and 5 pm. On Sunday the first peak is the same, but the second peak shifts earlier, to between 2:30 pm and 3:30 pm.

4.       On Saturday the second peak is higher than the first, whereas on Friday and Sunday the first peak is higher than the second.
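The peaks listed above were read from the timeline view. A minimal Python sketch of how such peaks could be found programmatically is given below, assuming each movement record carries a parsed timestamp. The hourly binning, the function names, and the toy timestamps are assumptions for illustration and are not taken from the submission.

```python
from collections import Counter
from datetime import datetime


def movements_per_hour(timestamps):
    """Count records per hour of day for a single day's timestamps."""
    return Counter(ts.hour for ts in timestamps)


def local_peaks(counts):
    """Return the hours whose count is strictly greater than both neighbouring hours."""
    peaks = []
    for hour in sorted(counts):
        before = counts.get(hour - 1, 0)
        after = counts.get(hour + 1, 0)
        if counts[hour] > before and counts[hour] > after:
            peaks.append(hour)
    return peaks


# Made-up timestamps for one day: (hour, number of records in that hour).
toy_hours = [(10, 1), (11, 3), (12, 1), (15, 1), (16, 2), (17, 1)]
toy = [datetime(2014, 6, 6, h, 0) for h, n in toy_hours for _ in range(n)]

counts = movements_per_hour(toy)
peaks = local_peaks(counts)  # hours 11 and 16 for this toy data
```

Hourly bins are coarse: a peak such as the one between 2:30 pm and 3:30 pm straddles two bins, so a finer bin width (for example 30 minutes) would be needed to locate it precisely.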

friMov.png

Image 2.1

 

satMov.png

Image 2.2

 

sunMov.png

Image 2.3

 

MC1.3 – What anomalies or unusual patterns do you see? Describe no more than 10 anomalies, and prioritize those unusual patterns that you think are most likely to be relevant to the crime.

 

Limit your response to no more than 10 images and 500 words.

 

The following images are snapshots from the path mode visualization:

First anomaly:

 

On Friday, the group of visitors with IDs: 1080969, 644885, 1781070, 521750, 1787551, 1935406, 1629516, 1600469 and 1763672 walked through the park without actually checking in to any of the attractions.

In image 3.1 we can see a strange zigzag pattern, in which the visitors approach multiple attractions without checking in.

During the first two hours, from 8 am to 10 am, this group went to the Grinosaurus stage (63) without checking in (as shown in image 3.2), and then their movement disappears for an hour. Finally, they returned to the East entrance during the two hours (11 am to 1 pm) before they left.
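Visitors of this kind, who move through the park but never check in, could also be found with a simple filter over the records. The sketch below is only an illustration. It assumes each record is a (visitor ID, record type) pair and that the record type is the string "check-in" or "movement"; those field values and the toy records are assumptions, not details stated in this submission.

```python
def visitors_without_checkins(records):
    """Return the IDs that appear in the records but never with a check-in."""
    seen = set()
    checked_in = set()
    for visitor_id, record_type in records:
        seen.add(visitor_id)
        if record_type == "check-in":  # assumed label for check-in records
            checked_in.add(visitor_id)
    return seen - checked_in


# Hypothetical records: visitor 1 only moves, visitor 2 checks in once.
toy_records = [
    (1, "movement"),
    (1, "movement"),
    (2, "movement"),
    (2, "check-in"),
]
suspects = visitors_without_checkins(toy_records)
```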

 

Second anomaly:

 

On Saturday, the group of visitors with IDs: 1737703, 1458915, 334793, 1748887, 1080969, 1600469, 1629516, 1781070, 1787551, 1935406, 521750, and 644885 repeated the pattern of the first anomaly at exactly the same hours. Notice the IDs common to both anomalies.

 

Third anomaly:

 

On Sunday, the group of visitors with IDs: 1563594, 1080969, 1600469, 1629516, 1781070, 1787551, 1935406, 521750, 644885, 921888, 1269018, 47441, 1217381, 430595, 1601276, 500084, 1711922, and 1149884 kept meeting around the Creighton Pavilion several times during the first three hours, then went around the rest of the park normally and met one last time at the pavilion before they left (image 3.3). Notice the IDs in common with the first and second anomalies.
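The overlap between the three anomalies can be checked directly with a set intersection. The following Python sketch uses the ID lists exactly as given in the first three anomalies; the variable names are chosen for the example.

```python
# Visitor IDs as listed in the first, second, and third anomalies.
friday = {1080969, 644885, 1781070, 521750, 1787551, 1935406,
          1629516, 1600469, 1763672}
saturday = {1737703, 1458915, 334793, 1748887, 1080969, 1600469,
            1629516, 1781070, 1787551, 1935406, 521750, 644885}
sunday = {1563594, 1080969, 1600469, 1629516, 1781070, 1787551,
          1935406, 521750, 644885, 921888, 1269018, 47441, 1217381,
          430595, 1601276, 500084, 1711922, 1149884}

# IDs that appear in all three anomalies.
common = sorted(friday & saturday & sunday)

# IDs from Friday's group that do not reappear on both later days.
friday_only = sorted(friday - (saturday & sunday))

print(len(common), common)
print(friday_only)
```

Eight IDs appear on all three days: 521750, 644885, 1080969, 1600469, 1629516, 1781070, 1787551, and 1935406. Of the Friday group, only 1763672 does not recur.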

 

Fourth anomaly:

Visitors with IDs: 655378, 1658667 and 159893 were also identified as anomalies on Sunday from the cluster view. Their path behavior is not zigzag, but they visited the Creighton Pavilion several times during the day without checking in to the pavilion or any other ride (image 3.4).

Image 3.1

 

 

Image 3.2

 

Image 3.3

suspect_1

Image 3.4